Improved Dual Decomposition Based Optimization for DSL Dynamic Spectrum Management

Paschalis Tsiaflakis*, Ion Necoara, Johan A. K. Suykens, Marc Moonen 
Electrical Engineering, Katholieke Universiteit Leuven
Kasteelpark Arenberg 10, B-3001 Leuven-Heverlee, Belgium
Phone: +32 (0)16 32 18 03, Fax: +32 (0)16 32 19 70
email: {paschalis.tsiaflakis, ion.necoara, johan.suykens, marc.moonen}@esat.kuleuven.be

Abstract 

Dynamic spectrum management (DSM) has been recognized as a key technology to significantly
improve the performance of digital subscriber line (DSL) broadband access networks. The basic concept
of DSM is to coordinate transmission over multiple DSL lines so as to mitigate the impact of crosstalk
interference amongst them. Many algorithms have been proposed to tackle the nonconvex optimization
problems appearing in DSM, almost all of them relying on a standard subgradient based dual
decomposition approach. In practice, however, this approach is often found to lead to extremely slow
convergence or even no convergence at all, one of the reasons being the very difficult tuning of the stepsize
parameters. In this paper we propose a novel improved dual decomposition approach inspired by recent
advances in mathematical programming. It uses a smoothing technique for the Lagrangian combined
with an optimal gradient based scheme for updating the Lagrange multipliers. The stepsize parameters
are furthermore selected optimally, removing the need for a tuning strategy. With this approach we show
how the convergence of current state-of-the-art DSM algorithms based on iterative convex approximations
(SCALE, CA-DSB) can be improved by one order of magnitude. Furthermore we apply the improved
dual decomposition approach to other DSM algorithms (OSB, ISB, ASB, (MS)-DSB, MIW) and propose
further improvements to obtain fast and robust DSM algorithms. Finally, we demonstrate the effectiveness
of the improved dual decomposition approach for a number of realistic multi-user DSL scenarios.

EDICS: SPC-TDLS Telephone networks and digital subscriber loops, SPC-MULT Multi-carrier, OFDM, and 
DMT communications, MSP-APPL Applications of MIMO communications and signal processing 

This research work was carried out at the ESAT Laboratory of the Katholieke Universiteit Leuven, in the frame of K.U. Leuven 
Research Council: CoE EF/05/006, GOA AMBioRICS, FWO project G.0235.07('Design and evaluation of DSL systems with 
common mode signal exploitation'), FWO project G.0226.06, Belgian Federal Science Policy Office IUAP DYSCO. 



February 14, 2013 



DRAFT 






I. Introduction 

Digital subscriber line (DSL) technology refers to a family of technologies that provide digital broad- 
band access over the local telephone network. It is currently the dominating broadband access technology 
with 66% of all broadband access subscribers worldwide using DSL to access the Internet [1]. The major
obstacle for further performance improvement in modern DSL networks is the so-called crosstalk, i.e. the 
electromagnetic interference amongst different lines in the same cable bundle. Different lines (i.e. users) 
indeed interfere with each other, leading to a very challenging interference environment where proper 
management of the resources is required to prevent a huge performance degradation. 

Dynamic spectrum management (DSM) has been recognized as a key technology to significantly
improve the performance of DSL broadband access networks [2]. The basic concept of DSM is to coordinate
transmission over multiple DSL lines so as to mitigate the impact of crosstalk interference amongst them.
There are two types of coordination, referred to as spectrum level and signal level coordination. Here, we
focus on spectrum level coordination, also referred to as spectrum management, spectrum balancing or
multi-carrier power control. Spectrum management aims to allocate transmit spectra, i.e. transmit powers
over all available frequencies (tones), to the different users so as to achieve some design objective. This
generally corresponds to an optimization problem, where typically a weighted sum of user data rates is
maximized subject to power constraints [3]-[5], which will be referred to as "constrained weighted rate
sum maximization (cWRS)". Recently this has been extended to other design objectives as well, such
as power driven designs (green DSL [7], [8]) and other utility driven designs [9], [10]. As shown
in [6], the key component of these designs is an efficient solution for the cWRS problem. Therefore we
will mainly focus on this problem and aim to find a robust and efficient solution for it.

The cWRS problem is known to be an NP-hard, separable nonconvex optimization problem that can
have many locally optimal solutions [9], [11]. Even for moderately sized problems (with 5-20 users and
200-4000 tones), finding the globally optimal solution is computationally prohibitive. In [3] and [4] the
authors proposed to use a dual decomposition approach with a standard subgradient based updating of
the Lagrange multipliers. Many DSM algorithms [3], [5], [11]-[18] have been proposed recently, almost
all of them using this standard subgradient based dual decomposition approach. In practice, however, this
approach is often found to lead to extremely slow convergence or even no convergence at all, especially
for large DSL scenarios with large crosstalk. One of the reasons is the very difficult tuning of the
stepsize parameters so as to guarantee fast convergence.

In this paper we propose a novel improved dual decomposition approach inspired by recent advances in
mathematical programming, more specifically the proximal center based decomposition method recently
proposed in [19]. This method uses a smoothing technique for the Lagrangian that preserves separability
of the problem, as recently proposed in [20]. The corresponding stepsize is determined in an optimal way
and so is straightforward to tune. The method is originally designed for separable convex problems, whereas
DSM optimization problems are highly nonconvex. In this paper we extend the proximal center based
decomposition method to an improved dual decomposition approach for application in the context of DSM.
With this approach, we show how the convergence of current state-of-the-art DSM algorithms based on
iterative convex approximations (SCALE [14], CA-DSB [11]) can be improved by one order of magnitude.
Furthermore we apply the improved dual decomposition approach to other DSM algorithms (OSB [3],
PBnB [12], ISB [13], ASB [16], (MS-)DSB [11], MIW [15], BB-OSB [5]), again leading to much faster
converging DSM algorithms. Then we demonstrate an important pitfall of applying dual decomposition to
nonconvex DSM problems and propose an effective solution for this that further improves the robustness
of current DSM algorithms. Finally we demonstrate the effectiveness of the improved dual decomposition
approach for a number of realistic multi-user DSL scenarios.

This paper is organized as follows. In Section II the system model is introduced for the DSL multi-user
environment. In Section III the basic cWRS problem is described and existing DSM algorithms
for this problem are reviewed, all of them relying on a subgradient based dual decomposition approach.
In Section IV-A an improved dual decomposition approach is proposed for DSM algorithms based on
iterative convex approximations. The improved dual decomposition approach is furthermore applied to
other DSM algorithms in Section IV-B. In Section V the problem of obtaining a primal solution from the
dual solution is described and an effective solution for it is proposed. Finally in Section VI simulation
results are shown.

II. System Model 

We consider a system consisting of $\mathcal{N} = \{1, \ldots, N\}$ interfering DSL users (i.e., lines, modems) with
standard synchronous discrete multi-tone (DMT) modulation with $\mathcal{K} = \{1, \ldots, K\}$ tones (i.e., frequencies
or carriers). The transmission can be modeled independently on each tone $k$ by

$$\mathbf{y}_k = \mathbf{H}_k \mathbf{x}_k + \mathbf{z}_k .$$

The vector $\mathbf{x}_k = [x_k^1, \ldots, x_k^N]^T$ contains the transmitted signals on tone $k$, where $x_k^n$ refers to the signal
transmitted by user $n$ on tone $k$. Vectors $\mathbf{z}_k$ and $\mathbf{y}_k$ have similar structures; $\mathbf{z}_k$ refers to the additive noise
on tone $k$, containing thermal noise, alien crosstalk, radio frequency interference (RFI), etc., and $\mathbf{y}_k$ refers
to the received signals on tone $k$. $\mathbf{H}_k$ is an $N \times N$ channel matrix with $[\mathbf{H}_k]_{n,m} = h_k^{n,m}$ referring to the
channel gain from transmitter $m$ to receiver $n$ on tone $k$. The diagonal elements are the direct channels
and the off-diagonal elements are the crosstalk channels.

The transmit power of user $n$ on tone $k$, also referred to as transmit power spectral density, is denoted
as $s_k^n \triangleq \Delta_f E\{|x_k^n|^2\}$, where $\Delta_f$ refers to the tone spacing. The vector $\mathbf{s}_k = \{s_k^n, n \in \mathcal{N}\}$ denotes the
transmit powers of all users on tone $k$. The vector $\mathbf{s}^n = \{s_k^n, k \in \mathcal{K}\}$ denotes the transmit powers of user
$n$ on all tones. The received noise power by user $n$ on tone $k$, also referred to as noise spectral density,
is denoted as $\sigma_k^n \triangleq \Delta_f E\{|z_k^n|^2\}$.

Note that we assume no signal coordination at the transmitters and at the receivers, and that the
interference is treated as additive white Gaussian noise. Under this standard assumption the bit loading
for user $n$ on tone $k$, given the transmit spectra of all users on tone $k$, is

$$b_k^n \triangleq b_k^n(\mathbf{s}_k) = \log_2\left(1 + \frac{1}{\Gamma}\,\frac{|h_k^{n,n}|^2 s_k^n}{\sum_{m \neq n} |h_k^{n,m}|^2 s_k^m + \sigma_k^n}\right) \ \text{bits/Hz}, \tag{1}$$

where $\Gamma$ denotes the SNR-gap to capacity, which is a function of the desired BER, the coding gain and
noise margin [21]. The DMT symbol rate is denoted as $f_s$. The achievable total data rate for user $n$ and
the total power used by user $n$ are equal to, respectively:

$$R^n \triangleq f_s \sum_{k \in \mathcal{K}} b_k^n \qquad \text{and} \qquad P^n \triangleq \sum_{k \in \mathcal{K}} s_k^n . \tag{2}$$
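As an illustration of (1) and (2), the following minimal Python sketch evaluates the bit loading on one tone and accumulates per-user rates and powers over all tones. The function names and the nested-list data layout are assumptions made for this example only; they do not come from the paper.

```python
import math


def bit_loading(h, s, sigma, gamma):
    """Bit loading b_k^n of (1) for every user n on a single tone.

    h[n][m]  : channel gain from transmitter m to receiver n on this tone
    s[n]     : transmit power of user n on this tone
    sigma[n] : noise power of user n on this tone
    gamma    : SNR gap in linear scale (not dB)
    """
    n_users = len(s)
    bits = []
    for n in range(n_users):
        interference = sum(abs(h[n][m]) ** 2 * s[m]
                           for m in range(n_users) if m != n)
        sinr = abs(h[n][n]) ** 2 * s[n] / (interference + sigma[n])
        bits.append(math.log2(1.0 + sinr / gamma))
    return bits


def rates_and_powers(h_tones, s_tones, sigma_tones, gamma, f_s):
    """Data rates R^n and total powers P^n of (2), summed over all tones."""
    n_users = len(s_tones[0])
    rates = [0.0] * n_users
    powers = [0.0] * n_users
    for h, s, sigma in zip(h_tones, s_tones, sigma_tones):
        b = bit_loading(h, s, sigma, gamma)
        for n in range(n_users):
            rates[n] += f_s * b[n]
            powers[n] += s[n]
    return rates, powers
```

For a single user without crosstalk the expression reduces to the familiar $\log_2(1 + \text{SNR}/\Gamma)$ per tone.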

III. Dynamic spectrum management 

A. Dynamic spectrum management problem 

The basic goal of DSM through spectrum level coordination is to allocate the transmit powers dynam- 
ically in response to physical channel conditions (channel gains and noise) so as to pursue certain design 
objectives and/or satisfy certain constraints. The constraints are mostly per-user total power constraints 
and so-called spectral mask constraints, i.e. 

$$\begin{aligned} P^n &\le P^{n,\mathrm{tot}}, && n \in \mathcal{N} && \text{(total power constraints)} \\ 0 \le s_k^n &\le s_k^{n,\mathrm{mask}}, && n \in \mathcal{N},\ k \in \mathcal{K} && \text{(spectral mask constraints)} \end{aligned} \tag{3}$$

where $P^{n,\mathrm{tot}}$ refers to the total available power budget for user $n$ and $s_k^{n,\mathrm{mask}}$ refers to the spectral mask
constraint for user $n$ on tone $k$. The user total power constraints can also be written in vector notation
as $\mathbf{P} \le \mathbf{P}^{\mathrm{tot}}$, where $\mathbf{P} = [P^1, \ldots, P^N]^T$ and $\mathbf{P}^{\mathrm{tot}} = [P^{1,\mathrm{tot}}, \ldots, P^{N,\mathrm{tot}}]^T$, and where '$\le$' denotes a
component-wise inequality.

The set of all possible data rate allocations that satisfy the constraints (3) can be characterized by the
achievable rate region $\mathcal{R}$:

$$\mathcal{R} = \Big\{ (R^n : n \in \mathcal{N}) \ \Big|\ R^n = f_s \sum_{k \in \mathcal{K}} b_k^n(\mathbf{s}_k), \ \text{s.t. (3)} \Big\}.$$

A typical design objective is to achieve some Pareto optimal allocation of the data rates $R^n$ [3], [5],
[11]-[16], [18], [22], [23]. This results in the following typical DSM optimization problem, which will
be referred to as the constrained weighted rate sum maximization (cWRS) formulation, where $w_n$ is the
weight given to user $n$:



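$$\begin{aligned} \max_{\{\mathbf{s}^n,\, n \in \mathcal{N}\}} \quad & \sum_{n \in \mathcal{N}} w_n R^n \\ \text{s.t.} \quad & P^n \le P^{n,\mathrm{tot}}, \quad n \in \mathcal{N}, \qquad \text{(cWRS)} \\ & 0 \le s_k^n \le s_k^{n,\mathrm{mask}}, \quad n \in \mathcal{N},\ k \in \mathcal{K}. \end{aligned} \tag{4}$$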



However, many other DSM formulations are possible. We refer to [6], which contains a collection of other
relevant DSM formulations. As shown in [6], the key component to tackling these is an efficient solution
for the cWRS problem (4). Therefore we will focus on this problem and aim to find a robust and efficient
solution for it.



B. Dynamic spectrum management algorithms 

The cWRS problem (4) is an NP-hard separable nonconvex optimization problem [9]. The number of
optimization variables is equal to $KN$, where the number of users $N$ ranges between 2 and 100 and the
number of tones $K$ can go up to 4000. Depending on the specific values of the channel and noise
parameters, there can be many locally optimal solutions that can differ significantly in value, as shown
in [11]. In [4] the authors show that strong duality holds for the continuous (frequency range) formulation,
and in [9] the authors prove asymptotic strong duality for the discrete (frequency range) formulation,
i.e. the duality gap goes to zero as $K \to \infty$. These results suggest that a Lagrange dual decomposition
approach is a viable way to reach approximate optimality for the discrete formulation (4), if the frequency
range is finely discretized, as is indeed the case in practical DSL scenarios where $K$ is large [4]. Many
dual decomposition based DSM algorithms [3], [5], [11]-[16], [18] have been proposed for solving (4),
almost all of them using a standard subgradient based updating of the Lagrange multipliers.
The dual problem formulation of (4) consists of two subproblems, namely a master problem

$$\begin{aligned} \min_{\lambda} \quad & g(\lambda) \\ \text{s.t.} \quad & \lambda \ge 0, \end{aligned} \tag{5}$$

where $\lambda = [\lambda_1, \ldots, \lambda_N]^T$, and a slave problem defined by the dual function $g(\lambda)$:

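$$g(\lambda) = \max_{\{\mathbf{s}_k, k \in \mathcal{K}\}} \mathcal{L}(\lambda, \mathbf{s}_k, k \in \mathcal{K}) = \left\{ \begin{array}{ll} \max\limits_{\{\mathbf{s}_k, k \in \mathcal{K}\}} & \sum\limits_{n \in \mathcal{N}} w_n R^n - \sum\limits_{n \in \mathcal{N}} \lambda_n (P^n - P^{n,\mathrm{tot}}) \\ \text{s.t.} & 0 \le s_k^n \le s_k^{n,\mathrm{mask}}, \quad n \in \mathcal{N},\ k \in \mathcal{K}, \end{array} \right. \tag{6}$$

where $\mathcal{L}(\lambda, \mathbf{s}_k, k \in \mathcal{K})$ is the Lagrangian. This can be reformulated as:

$$\left\{ \begin{array}{ll} \max\limits_{\{\mathbf{s}_k, k \in \mathcal{K}\}} & \sum\limits_{k \in \mathcal{K}} \Big[ b_k(\mathbf{s}_k) - \sum\limits_{n \in \mathcal{N}} \lambda_n s_k^n \Big] + \sum\limits_{n \in \mathcal{N}} \lambda_n P^{n,\mathrm{tot}} \\ \text{s.t.} & 0 \le s_k^n \le s_k^{n,\mathrm{mask}}, \quad n \in \mathcal{N},\ k \in \mathcal{K}, \end{array} \right. \tag{7}$$

with $b_k(\mathbf{s}_k) = \sum_{n \in \mathcal{N}} w_n f_s b_k^n(\mathbf{s}_k)$.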

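The slave optimization problem (7) can then be decomposed into $K$ independent nonconvex subproblems
(dual decomposition):

$$g(\lambda) = \sum_{k \in \mathcal{K}} g_k(\lambda), \quad \text{with} \quad g_k(\lambda) = \left\{ \begin{array}{ll} \max\limits_{\mathbf{s}_k} & b_k(\mathbf{s}_k) - \sum\limits_{n \in \mathcal{N}} \lambda_n s_k^n + \sum\limits_{n \in \mathcal{N}} \lambda_n P^{n,\mathrm{tot}} / K \\ \text{s.t.} & 0 \le s_k^n \le s_k^{n,\mathrm{mask}}, \quad n \in \mathcal{N}. \end{array} \right. \tag{8}$$
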
The master problem (5), also called the dual problem, is a convex optimization problem. Its objective
function, i.e. the dual function $g(\lambda)$, is however non-differentiable. The reason for this non-differentiability
is that the underlying slave optimization problem (8) can have multiple globally optimal solutions for
some values of the Lagrange multipliers $\lambda$. In [3], [4] a subgradient approach is proposed for this dual
master problem, where the subgradient is defined as

$$dg(\lambda) \triangleq \sum_{k \in \mathcal{K}} \mathbf{s}_k(\lambda) - \mathbf{P}^{\mathrm{tot}} \tag{9}$$

with $\mathbf{s}_k(\lambda)$ referring to the optimal solution of (8) for given Lagrange multipliers $\lambda$, also called dual
variables, and the corresponding subgradient update is:

$$\lambda = \Big[ \lambda + \delta \Big( \sum_{k \in \mathcal{K}} \mathbf{s}_k(\lambda) - \mathbf{P}^{\mathrm{tot}} \Big) \Big]^+ \tag{10}$$

where $[x]^+$ denotes the projection of $x \in \mathbb{R}^N$ onto $\mathbb{R}^N_+$ and where the stepsize $\delta$ can be chosen using
different procedures [3], [5], e.g. $\delta = q/i$ where $q$ is the initial stepsize and $i$ is the iteration counter. By
iteratively applying (10) and (8), convergence to an optimal solution of (5) can be achieved, i.e. $\lambda \to \lambda^*$,
for which the complementary conditions, $\lambda_n (\sum_{k \in \mathcal{K}} s_k^n(\lambda) - P^{n,\mathrm{tot}}) = 0$, $n \in \mathcal{N}$, are satisfied when
strong duality "holds" ($K \to \infty$). This general standard dual decomposition approach is visualized in
Figure 1.
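The loop formed by (8) and (10) can be sketched as follows for two users. This is only an illustrative sketch under stated assumptions: the per-tone subproblem is solved by an exhaustive search over a small, hypothetical grid of power levels (in the spirit of a discrete search, not any specific published algorithm), the stepsize rule is $\delta = q/i$, and all function and parameter names are invented for the example.

```python
import math


def weighted_bits(h, s, sigma, gamma, w, f_s):
    """b_k(s_k) = sum_n w_n f_s b_k^n(s_k) on one tone, cf. (1) and (7)."""
    total = 0.0
    for n in range(len(s)):
        interf = sum(abs(h[n][m]) ** 2 * s[m]
                     for m in range(len(s)) if m != n)
        sinr = abs(h[n][n]) ** 2 * s[n] / (interf + sigma[n])
        total += w[n] * f_s * math.log2(1.0 + sinr / gamma)
    return total


def solve_tone(h, sigma, gamma, w, f_s, lam, levels):
    """Per-tone subproblem of (8) for two users, by exhaustive grid search."""
    best, best_val = None, -math.inf
    for s1 in levels:
        for s2 in levels:
            val = (weighted_bits(h, [s1, s2], sigma, gamma, w, f_s)
                   - lam[0] * s1 - lam[1] * s2)
            if val > best_val:
                best, best_val = [s1, s2], val
    return best


def subgradient_dsm(tones, p_tot, levels, gamma, w, f_s, q, iters):
    """Alternate the per-tone solves (8) with the multiplier update (10)."""
    lam = [0.0, 0.0]
    s_all = []
    for i in range(1, iters + 1):
        # Slave problem: one independent solve per tone.
        s_all = [solve_tone(h, sigma, gamma, w, f_s, lam, levels)
                 for (h, sigma) in tones]
        # Subgradient (9): used power minus power budget, per user.
        used = [sum(s[n] for s in s_all) for n in range(2)]
        delta = q / i
        # Projected update (10).
        lam = [max(0.0, lam[n] + delta * (used[n] - p_tot[n]))
               for n in range(2)]
    # s_all holds the powers computed with the multipliers before the last update.
    return lam, s_all
```

The choice of `q` strongly affects how quickly the multipliers settle, which is the tuning difficulty discussed in the text.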

Note that the per-tone subproblems (8) are nonconvex optimization problems. Many existing DSM
algorithms differ only in the way these subproblems are solved, where strategies are proposed such as
exhaustive discrete search (OSB [3]), branch and bound search (PBnB [12], BB-OSB [5]), coordinate
descent discrete search (ISB [13], [22]), solving the KKT system (DSB [11], MIW [15], MS-DSB [11]),
and heuristic approximation (ASB [16], ASB2 [17]).



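[Figure 1: flowchart. Initialize $\lambda$. Slave problem: solve the per-tone subproblems $k = 1, 2, 3, \ldots, K$ to obtain $\mathbf{s}_k, k \in \mathcal{K}$. Test $\lambda_n(\sum_{k} s_k^n(\lambda) - P^{n,\mathrm{tot}}) = 0, \forall n$. If NO: master problem, subgradient update of Lagrange multipliers, and repeat. If YES: STOP with $s_k^n = s_k^n(\lambda), \forall n, k$ and $\lambda^* = \lambda$.]

Fig. 1. General structure of subgradient based dual decomposition approach for DSM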



An alternative approach is based on iterative convex approximations such as in SCALE [14] and
CA-DSB [11]. This approach basically consists of iteratively executing the following two steps: (i)
approximating the nonconvex cWRS problem (4) by a separable convex optimization problem $F_{\mathrm{cvx}}$, and
(ii) solving this convex approximation by using a subgradient based dual decomposition approach. Note
that under some conditions on the approximation, described in [24], iteratively executing these steps
results in asymptotic convergence to a locally optimal solution of cWRS (4). The convex approximations
used by CA-DSB and SCALE both satisfy these conditions. This approach is visualized in Figure 2,
where cvx refers to the per-tone convex problem obtained from the convex approximation $F_{\mathrm{cvx}}$. We
emphasize that these DSM algorithms also use a subgradient based dual decomposition approach to solve
a convex optimization problem in each iteration. This step accounts for the major part of the computational cost.

[Figure 2: flowchart. Inner loop: per-tone convex problems are solved for $\mathbf{s}_k, k \in \mathcal{K}$, with Lagrange multiplier updates repeated (NO branch) until the dual problem is solved (YES branch). Outer loop: test "converged to locally optimal solution of cWRS?"; if NO, a new convex approximation is built; if YES, STOP with $\lambda^* = \lambda$.]

Fig. 2. Structure of iterative convex approximation approach for DSM



IV. Improved dual decomposition 

In practice, the standard subgradient based dual decomposition approach is often found to lead to
extremely slow convergence or even no convergence at all, especially for large DSL scenarios (6-20
users) with large crosstalk (VDSL(2)). There are several reasons for this: (i) subgradient methods are
generally known not to be efficient, i.e. they show worst case convergence of order $O(1/\epsilon^2)$, with $\epsilon$ referring
to the required accuracy of the approximation of the optimum [20], (ii) the stepsize used by subgradient
methods is quite difficult to tune in order to guarantee fast convergence, (iii) the nonconvex nature of
the problem implies that special care should be taken in obtaining the optimal primal variables from the
optimal dual variables.

For separable convex problems, i.e. with a separable convex objective function but with convex coupling
constraints, several alternative dual decomposition approaches have been proposed such as the alternating
direction method [25], proximal method of multipliers [26], partial inverse method [27], etc. Here, we
focus on a recently proposed dual decomposition approach in [19], referred to as the proximal center
based decomposition method. This method shows interesting properties, namely it preserves separability
of the problem, it uses an optimal gradient based scheme, and it uses an optimal stepsize which leads to
straightforward tuning. In this section we extend this method to an improved dual decomposition approach
for solving cWRS (4). This approach will be used first in Section IV-A to improve the convergence of DSM
algorithms using iterative convex approximations (SCALE, CA-DSB) by one order of magnitude. In
Section IV-B this will be extended to other DSM algorithms such as OSB, ISB, PBnB, BB-OSB, ASB,
(MS-)DSB, MIW, etc. We will refer to these DSM algorithms that are not based on iterative convex
approximations as "direct DSM algorithms".

A. An improved dual decomposition approach for iterative convex approximation based DSM algorithms 

Two state-of-the-art DSM algorithms that are based on iterative convex approximations are SCALE
and CA-DSB. These basically consist of two steps, as explained in Section III-B, which are iteratively
executed. In this section we will propose an improved dual decomposition approach for solving the
convex optimization problem in the second step. We will elaborate this for CA-DSB and explain how
its convergence speed is improved by one order of magnitude, i.e. from $O(1/\epsilon^2)$ to $O(1/\epsilon)$. The improved
dual decomposition approach can similarly be applied to the SCALE algorithm to obtain a similar speed
up, but requires more complicated notation because of the inherent exponential transformation of variables.



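For CA-DSB, the convex approximation in each iteration is obtained by reformulating the objective
of cWRS as a sum of a concave part and a convex part, and then approximating the convex part by
a first order Taylor expansion. The resulting convex approximation, its dual formulation, dual function,
and Lagrangian are given in (11), (12), (13) and (14), respectively.

$$f^*_{\mathrm{cvx}} = \Big\{ \max_{\{\mathbf{s}_k \in \mathcal{S}_k,\, k \in \mathcal{K}\}} \sum_{k \in \mathcal{K}} b_{k,\mathrm{cvx}}(\mathbf{s}_k) \quad \text{s.t.} \ \sum_{k \in \mathcal{K}} s_k^n \le P^{n,\mathrm{tot}},\ n \in \mathcal{N} \Big\} \tag{11}$$

$$\min_{\lambda \ge 0} \ g_{\mathrm{cvx}}(\lambda) \tag{12}$$

$$g_{\mathrm{cvx}}(\lambda) = \max_{\{\mathbf{s}_k \in \mathcal{S}_k,\, k \in \mathcal{K}\}} \mathcal{L}_{\mathrm{cvx}}(\mathbf{s}_k, k \in \mathcal{K}, \lambda) \tag{13}$$

$$\mathcal{L}_{\mathrm{cvx}}(\mathbf{s}_k, k \in \mathcal{K}, \lambda) = \sum_{k \in \mathcal{K}} b_{k,\mathrm{cvx}}(\mathbf{s}_k) - \sum_{k \in \mathcal{K}} \sum_{n \in \mathcal{N}} \lambda_n s_k^n + \sum_{n \in \mathcal{N}} \lambda_n P^{n,\mathrm{tot}} \tag{14}$$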

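where $\mathcal{S}_k = \{\mathbf{s}_k \in \mathbb{R}^N : 0 \le s_k^n \le s_k^{n,\max}, n \in \mathcal{N}\}$ is a compact convex set with
$s_k^{n,\max} := \min(s_k^{n,\mathrm{mask}}, P^{n,\mathrm{tot}})$ and $P^{n,\mathrm{tot}} < \infty$, and where $b_{k,\mathrm{cvx}}(\mathbf{s}_k)$ is concave and given as:

$$b_{k,\mathrm{cvx}}(\mathbf{s}_k) = \sum_{n \in \mathcal{N}} w_n f_s \log_2\Big( \sum_{m \in \mathcal{N}} |\tilde{h}_k^{n,m}|^2 s_k^m + \Gamma \sigma_k^n \Big) - \sum_{n \in \mathcal{N}} w_n f_s \Big( \sum_{m \neq n} a_k^{n,m} s_k^m + c_k^n \Big), \tag{15}$$

with $a_k^{n,m}, c_k^n, \forall n, m, k$ constant approximation parameters, obtained by a closed-form formula in the
approximation step [11], and with

$$|\tilde{h}_k^{n,m}|^2 = \begin{cases} \Gamma |h_k^{n,m}|^2, & n \neq m \\ |h_k^{n,n}|^2, & n = m. \end{cases} \tag{16}$$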

The convex problem (11) has a separable structure and so the standard way to solve it is by focusing
on the dual problem (12) and using a subgradient update approach for the dual variables. This subgradient
based dual decomposition approach is however known [20] to have a convergence speed of order $O(1/\epsilon^2)$,
where $\epsilon$ is the required accuracy for the approximation of the optimum. In the sequel, it will be shown
how the "proximal center based decomposition" method from [19] can be adapted for solving the convex
approximation, leading to a scheme with convergence speed of order $O(1/\epsilon)$, i.e. one order of magnitude
faster but with the same computational complexity. The basic steps in this result are as follows. First an
approximated (smoothed) dual function $\bar{g}_{\mathrm{cvx}}(\lambda)$ is defined that can be chosen to be arbitrarily close to the
original dual function $g_{\mathrm{cvx}}(\lambda)$. Then it is proven that this smoothed dual function $\bar{g}_{\mathrm{cvx}}$ is differentiable
and has a Lipschitz continuous gradient. Finally an optimal gradient scheme is applied to this smoothed
dual function.



We introduce the following functions $d_k(\mathbf{s}_k), k \in \mathcal{K}$, which are called prox-functions in [19] and are
defined as follows:

Definition 1: A prox-function $d_k(\mathbf{s}_k)$ has the following properties:

• $d_k(\mathbf{s}_k)$ is a non-negative continuous and strongly convex function with convexity parameter $\sigma_{\mathbf{s}_k}$

• $d_k(\mathbf{s}_k)$ is defined for the compact convex set $\mathcal{S}_k$

An example of a valid prox-function is $d_k(\mathbf{s}_k) = \|\mathbf{s}_k\|^2$, which is also used in our concrete implementations
(see Section VI). As many other valid prox-functions exist, and in order not to lose generality,
we continue with $d_k(\mathbf{s}_k)$. Since $\mathcal{S}_k, k \in \mathcal{K}$, are compact and $d_k(\mathbf{s}_k)$ are continuous, we can choose finite
and positive constants such that

$$D_{\mathcal{S}_k} \ge \max_{\mathbf{s}_k \in \mathcal{S}_k} d_k(\mathbf{s}_k), \quad k \in \mathcal{K}. \tag{17}$$

The prox-functions can be used to smoothen the dual function $g_{\mathrm{cvx}}(\lambda)$ to obtain a smoothed dual
function $\bar{g}_{\mathrm{cvx}}(\lambda)$ as follows:

$$\bar{g}_{\mathrm{cvx}}(\lambda) = \max_{\{\mathbf{s}_k \in \mathcal{S}_k,\, k \in \mathcal{K}\}} \sum_{k \in \mathcal{K}} \Big[ b_{k,\mathrm{cvx}}(\mathbf{s}_k) - \sum_{n \in \mathcal{N}} \lambda_n (s_k^n - P^{n,\mathrm{tot}}/K) - c\, d_k(\mathbf{s}_k) \Big], \tag{18}$$

where $c$ is a positive smoothness parameter that will be defined later in this section. By using a sufficiently
small value for $c$, the smoothed dual function can be made arbitrarily close to the original dual function. Note that
we can also choose different parameters $c_k$ for each prox-term. The generalization is straightforward.

One useful property of the particular choice of prox-functions is that they do not destroy the separability
of the objective function in (18), i.e.

$$\bar{g}_{\mathrm{cvx}}(\lambda) = \sum_{k \in \mathcal{K}} \max_{\mathbf{s}_k \in \mathcal{S}_k} \Big[ b_{k,\mathrm{cvx}}(\mathbf{s}_k) - \sum_{n \in \mathcal{N}} \lambda_n (s_k^n - P^{n,\mathrm{tot}}/K) - c\, d_k(\mathbf{s}_k) \Big]. \tag{19}$$

Denote by $\bar{\mathbf{s}}_{k,\mathrm{cvx}}(\lambda), k \in \mathcal{K}$, the optimal solution of the maximization problem in (19). The following
theorem describes the properties of the smoothed dual function $\bar{g}_{\mathrm{cvx}}(\lambda)$:

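Theorem 1 ([19]): The function $\bar{g}_{\mathrm{cvx}}(\lambda)$ is convex and continuously differentiable at any $\lambda \in \mathbb{R}^N$.
Moreover, its gradient $\nabla \bar{g}_{\mathrm{cvx}}(\lambda) = \sum_{k \in \mathcal{K}} \bar{\mathbf{s}}_{k,\mathrm{cvx}}(\lambda) - \mathbf{P}^{\mathrm{tot}}$ is Lipschitz continuous with Lipschitz constant
$L_c = \sum_{k \in \mathcal{K}} \frac{1}{c\, \sigma_{\mathbf{s}_k}}$. The following inequalities also hold:

$$\bar{g}_{\mathrm{cvx}}(\lambda) \le g_{\mathrm{cvx}}(\lambda) \le \bar{g}_{\mathrm{cvx}}(\lambda) + c \sum_{k \in \mathcal{K}} D_{\mathcal{S}_k}, \quad \lambda \in \mathbb{R}^N. \tag{20}$$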
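The sandwich inequality (20) can be checked numerically on a toy instance. The sketch below is only an illustration under assumptions that are not taken from the paper: a single tone and a single user, the concave stand-in $b(s) = \ln(1+s)$ on $\mathcal{S} = [0, 1]$, the prox-function $d(s) = s^2$ (so that $D_{\mathcal{S}} = 1$), and a grid search in place of an exact maximization.

```python
import math


def dual_values(lam, c, p_tot=0.5, grid=1000):
    """Return (g, g_bar): the dual function and its smoothed version.

    Toy setting (assumed for illustration): b(s) = ln(1 + s) on [0, 1],
    prox-function d(s) = s**2, maximization by grid search.
    """
    g, g_bar = -math.inf, -math.inf
    for j in range(grid + 1):
        s = j / grid
        val = math.log(1.0 + s) - lam * (s - p_tot)
        g = max(g, val)                    # unsmoothed dual, cf. (13)
        g_bar = max(g_bar, val - c * s * s)  # smoothed dual, cf. (18)
    return g, g_bar
```

Evaluating `dual_values` for decreasing `c` shows the smoothed value approaching the original one from below, with the gap never exceeding `c * D` where `D = 1` in this toy setting.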

The addition of the prox-functions thus leads to a convex differentiable dual function with Lipschitz
continuous gradient. Now instead of solving the original dual problem (12), we focus on the following
problem

$$\min_{\lambda \ge 0} \ \bar{g}_{\mathrm{cvx}}(\lambda) \tag{21}$$

Note that, by making $c$ sufficiently small in (19), the solution of (21) can be made arbitrarily close to
the solution of (12). Taking the particular structure of (21) into account, i.e. a differentiable objective
function with Lipschitz continuous gradient, we propose the optimal gradient based scheme given in
Algorithm 1, derived from [19], for solving (21). This algorithm will be referred to as the improved dual
decomposition algorithm for solving the convex approximation of CA-DSB (11).

The specific value for $L_c$ depends on the chosen prox-function $d_k(\mathbf{s}_k)$, as given in Theorem 1. The
specific value for $c$ will be defined later in Theorem 2. Note that lines 6-10 of Algorithm 1 correspond
to the improved Lagrange multiplier updates. By comparing this with the standard subgradient Lagrange
multiplier update (10), one can observe that the standard and improved update require a similar complexity.

The remaining issue is to prove that $\hat{\mathbf{s}}_k, k \in \mathcal{K}$, converges to an $\epsilon$-optimal solution in $i_{\max}$ iterations
where $i_{\max}$ is of the order $O(1/\epsilon)$. For this we define the following lemmas that will be used in the sequel.

Algorithm 1 Improved dual decomposition algorithm for solving (11) for CA-DSB

1: $i := 0$, $\mathrm{tmp} := 0$
2: initialize $i_{\max}$
3: initialize $\lambda^i$
4: for $i = 0 \ldots i_{\max}$ do
5:   $\forall k$: $\mathbf{s}_k^{i+1} = \arg\max_{\{\mathbf{s}_k \in \mathcal{S}_k\}} \ b_{k,\mathrm{cvx}}(\mathbf{s}_k) - \sum_{n \in \mathcal{N}} \lambda_n^i s_k^n - c\, d_k(\mathbf{s}_k)$
6:   $dg^{i+1} = \sum_{k \in \mathcal{K}} \mathbf{s}_k^{i+1} - \mathbf{P}^{\mathrm{tot}}$
7:   $u^{i+1} = \big[ \frac{dg^{i+1}}{L_c} + \lambda^i \big]^+$
8:   $\mathrm{tmp} := \mathrm{tmp} + \frac{i+1}{2} dg^{i+1}$
9:   $v^{i+1} = \big[ \frac{\mathrm{tmp}}{L_c} \big]^+$
10:  $\lambda^{i+1} = \frac{i+1}{i+3} u^{i+1} + \frac{2}{i+3} v^{i+1}$
11:  $i := i + 1$
12: end for
13: Build $\hat{\lambda} = \lambda^{i_{\max}+1}$ and $\hat{\mathbf{s}}_k = \sum_{i=0}^{i_{\max}} \frac{2(i+1)}{(i_{\max}+1)(i_{\max}+2)} \mathbf{s}_k^{i+1}$
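To make the multiplier updates of Algorithm 1 concrete, the following Python sketch runs the scheme on a deliberately small toy problem. Everything specific to the toy is an assumption for illustration and is not taken from the paper: one user, identical tones with the concave stand-in $b_k(s) = \ln(1+s)$, the prox-function $d_k(s) = s^2$ (convexity parameter 2, $D_{\mathcal{S}_k} = s_{\max}^2$), a bisection solver for line 5, and $\lambda^0 = 0$. The values of $c$, $L_c$ and $i_{\max}$ follow the formulas stated in Theorems 1 and 2.

```python
import math


def solve_tone(lam, c, s_max):
    """Line 5 for one tone: maximize ln(1+s) - lam*s - c*s**2 on [0, s_max].

    The objective is strictly concave, so its derivative is decreasing and
    the maximizer can be found by bisection on the derivative.
    """
    def slope(s):
        return 1.0 / (1.0 + s) - lam - 2.0 * c * s

    if slope(0.0) <= 0.0:
        return 0.0
    if slope(s_max) >= 0.0:
        return s_max
    lo, hi = 0.0, s_max
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if slope(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def improved_dual_decomposition(num_tones, p_tot, s_max, eps):
    """Algorithm 1 on the toy problem; returns (lam_hat, s_hat, i_max)."""
    d_bound = s_max ** 2            # D_{S_k} for d_k(s) = s**2
    sigma = 2.0                     # convexity parameter of s**2
    c = eps / (num_tones * d_bound)
    l_c = num_tones / (c * sigma)   # Lipschitz constant from Theorem 1
    i_max = math.ceil(
        2.0 * math.sqrt((num_tones / sigma) * (num_tones * d_bound)) / eps - 1.0)

    lam, tmp = 0.0, 0.0
    s_hat = [0.0] * num_tones
    for i in range(i_max + 1):
        s = [solve_tone(lam, c, s_max) for _ in range(num_tones)]  # line 5
        dg = sum(s) - p_tot                                         # line 6
        u = max(0.0, dg / l_c + lam)                                # line 7
        tmp += 0.5 * (i + 1) * dg                                   # line 8
        v = max(0.0, tmp / l_c)                                     # line 9
        lam = (i + 1) / (i + 3) * u + 2.0 / (i + 3) * v             # line 10
        weight = 2.0 * (i + 1) / ((i_max + 1) * (i_max + 2))
        s_hat = [a + weight * b for a, b in zip(s_hat, s)]          # line 13
    return lam, s_hat, i_max
```

Note that no stepsize appears anywhere in the loop: the only inputs are the accuracy `eps` and the problem data.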



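Lemma 1: For any $y \in \mathbb{R}^n$ and $z \ge 0$, the following inequality holds:

$$y^T z \le \|[y]^+\| \, \|z\| \tag{22}$$

Proof: Let us define the index sets $\mathcal{I}^- = \{i \in \{1 \ldots n\} : y_i < 0\}$ and $\mathcal{I}^+ = \{i \in \{1 \ldots n\} : y_i \ge 0\}$.
Then,

$$y^T z = \sum_{i \in \mathcal{I}^-} y_i z_i + \sum_{i \in \mathcal{I}^+} y_i z_i \le \sum_{i \in \mathcal{I}^+} y_i z_i = ([y]^+)^T z \le \|[y]^+\| \, \|z\|.$$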



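The following lemma provides a lower bound for the primal gap, $f^*_{\mathrm{cvx}} - \sum_{k \in \mathcal{K}} b_{k,\mathrm{cvx}}(\hat{\mathbf{s}}_k)$, of (11):
Lemma 2: Let $\lambda^*$ be any optimal Lagrange multiplier, then for any $\hat{\mathbf{s}}_k \in \mathcal{S}_k, k \in \mathcal{K}$, the following
lower bound on the primal gap holds:

$$f^*_{\mathrm{cvx}} - \sum_{k \in \mathcal{K}} b_{k,\mathrm{cvx}}(\hat{\mathbf{s}}_k) \ge -\|\lambda^*\| \, \Big\| \Big[ \sum_{k \in \mathcal{K}} \hat{\mathbf{s}}_k - \mathbf{P}^{\mathrm{tot}} \Big]^+ \Big\| \tag{23}$$

1 For the sake of an easy exposition we consider in the paper only the Euclidean norm $\| \cdot \|$, although other norms can also be
used (see [19] for a detailed exposition).

Proof: From the assumptions of the lemma we have

$$f^*_{\mathrm{cvx}} = \max_{\{\mathbf{s}_k \in \mathcal{S}_k\}} \sum_{k} b_{k,\mathrm{cvx}}(\mathbf{s}_k) - \lambda^{*T} \Big( \sum_{k} \mathbf{s}_k - \mathbf{P}^{\mathrm{tot}} \Big) \ge \sum_{k} b_{k,\mathrm{cvx}}(\hat{\mathbf{s}}_k) - \lambda^{*T} \Big( \sum_{k} \hat{\mathbf{s}}_k - \mathbf{P}^{\mathrm{tot}} \Big) \tag{24}$$

and then (23) is obtained by applying Lemma 1.
■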

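From Lemma 2 it follows that if $\|[\sum_{k \in \mathcal{K}} \hat{\mathbf{s}}_k - \mathbf{P}^{\mathrm{tot}}]^+\| \le \epsilon_c$, then the primal gap is bounded, i.e. for
all $\hat{\lambda} \in \mathbb{R}^N_+$

$$-\epsilon_c \|\lambda^*\| \le f^*_{\mathrm{cvx}} - \sum_{k \in \mathcal{K}} b_{k,\mathrm{cvx}}(\hat{\mathbf{s}}_k) \le g_{\mathrm{cvx}}(\hat{\lambda}) - \sum_{k \in \mathcal{K}} b_{k,\mathrm{cvx}}(\hat{\mathbf{s}}_k). \tag{25}$$

Therefore, if we are able to derive an upper bound $\epsilon$ for the dual gap, namely $g_{\mathrm{cvx}}(\hat{\lambda}) - \sum_{k \in \mathcal{K}} b_{k,\mathrm{cvx}}(\hat{\mathbf{s}}_k)$,
and an upper bound $\epsilon_c$ for the coupling constraints for some given $\hat{\lambda} \ (\ge 0)$ and $\hat{\mathbf{s}}_k \in \mathcal{S}_k, \forall k$, then we can
conclude that $\hat{\mathbf{s}}_k$ is an $(\epsilon, \epsilon_c)$-solution for (11) (since in this case $-\epsilon_c \|\lambda^*\| \le f^*_{\mathrm{cvx}} - \sum_{k \in \mathcal{K}} b_{k,\mathrm{cvx}}(\hat{\mathbf{s}}_k) \le \epsilon$).
The next theorem derives these upper bounds for Algorithm 1 and provides a concrete value for $c$.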

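Theorem 2: Let $\lambda^*$ be an optimal Lagrange multiplier. Taking $c = \frac{\epsilon}{\sum_k D_{\mathcal{S}_k}}$ and
$i_{\max} + 1 = 2 \sqrt{\big(\sum_k \frac{1}{\sigma_{\mathbf{s}_k}}\big)\big(\sum_k D_{\mathcal{S}_k}\big)} \, \frac{1}{\epsilon}$, then after $i_{\max}$ iterations Algorithm 1 obtains an approximate
solution $\hat{\mathbf{s}}_k, k \in \mathcal{K}$, to the convex approximation (11) with a duality gap less than $\epsilon$, i.e.

$$g_{\mathrm{cvx}}(\hat{\lambda}) - \sum_{k \in \mathcal{K}} b_{k,\mathrm{cvx}}(\hat{\mathbf{s}}_k) \le \epsilon, \tag{26}$$

and the constraints satisfy

$$\Big\| \Big[ \sum_{k} \hat{\mathbf{s}}_k - \mathbf{P}^{\mathrm{tot}} \Big]^+ \Big\| \le \epsilon \Big( \|\lambda^*\| + \sqrt{\|\lambda^*\|^2 + 2} \Big). \tag{27}$$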

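Proof: Using a similar reasoning as in Theorem 3.4 in [19] we can show that for any $c$ the following
inequality holds:

$$\bar{g}_{\mathrm{cvx}}(\hat{\lambda}) \le \min_{\lambda \ge 0} \Big\{ \frac{2 L_c}{(i_{\max}+1)^2} \|\lambda\|^2 + \sum_{i=0}^{i_{\max}} \frac{2(i+1)}{(i_{\max}+1)(i_{\max}+2)} \big[ \bar{g}_{\mathrm{cvx}}(\lambda^i) + (\nabla \bar{g}_{\mathrm{cvx}}(\lambda^i))^T (\lambda - \lambda^i) \big] \Big\}.$$

Replacing $\bar{g}_{\mathrm{cvx}}(\lambda^i)$ and $\nabla \bar{g}_{\mathrm{cvx}}(\lambda^i)$ by their expressions given in (18) and Theorem 1, respectively, and
taking into account that the functions $b_{k,\mathrm{cvx}}$ are concave, we obtain the following inequality:

$$g_{\mathrm{cvx}}(\hat{\lambda}) - \sum_{k \in \mathcal{K}} b_{k,\mathrm{cvx}}(\hat{\mathbf{s}}_k) \le c \Big( \sum_{k \in \mathcal{K}} D_{\mathcal{S}_k} \Big) + \min_{\lambda \ge 0} \Big\{ \frac{2 L_c}{(i_{\max}+1)^2} \|\lambda\|^2 - \Big\langle \lambda, \sum_{k} \hat{\mathbf{s}}_k - \mathbf{P}^{\mathrm{tot}} \Big\rangle \Big\}$$
$$= c \Big( \sum_{k \in \mathcal{K}} D_{\mathcal{S}_k} \Big) - \frac{(i_{\max}+1)^2}{8 L_c} \Big\| \Big[ \sum_{k} \hat{\mathbf{s}}_k - \mathbf{P}^{\mathrm{tot}} \Big]^+ \Big\|^2 \le c \Big( \sum_{k \in \mathcal{K}} D_{\mathcal{S}_k} \Big).$$

By taking $c = \frac{\epsilon}{\sum_k D_{\mathcal{S}_k}}$ we obtain (26). For the constraints, using Lemma 2 and the previous inequality
we get that $\|[\sum_k \hat{\mathbf{s}}_k - \mathbf{P}^{\mathrm{tot}}]^+\|$ satisfies the second order inequality in $y$: $\frac{(i_{\max}+1)^2}{8 L_c} y^2 - \|\lambda^*\| y - \epsilon \le 0$.
Therefore, $\|[\sum_k \hat{\mathbf{s}}_k - \mathbf{P}^{\mathrm{tot}}]^+\|$ must be less than the largest root of the corresponding second-order
equation, i.e.

$$\Big\| \Big[ \sum_{k} \hat{\mathbf{s}}_k - \mathbf{P}^{\mathrm{tot}} \Big]^+ \Big\| \le \frac{4 L_c}{(i_{\max}+1)^2} \Big( \|\lambda^*\| + \sqrt{\|\lambda^*\|^2 + \frac{\epsilon (i_{\max}+1)^2}{2 L_c}} \Big).$$

By taking $i_{\max} = 2 \sqrt{\big(\sum_k \frac{1}{\sigma_{\mathbf{s}_k}}\big)\big(\sum_k D_{\mathcal{S}_k}\big)} \, \frac{1}{\epsilon} - 1$, we obtain (27).
■

From Theorem 2 we can conclude that by taking $c = \frac{\epsilon}{\sum_k D_{\mathcal{S}_k}}$, Algorithm 1 converges to a solution with
duality gap less than $\epsilon$ and with constraint violation satisfying $\|[\sum_k \hat{\mathbf{s}}_k - \mathbf{P}^{\mathrm{tot}}]^+\| \le \epsilon(\|\lambda^*\| + \sqrt{\|\lambda^*\|^2 + 2})$
after $i_{\max} = 2 \sqrt{\big(\sum_k \frac{1}{\sigma_{\mathbf{s}_k}}\big)\big(\sum_k D_{\mathcal{S}_k}\big)} \, \frac{1}{\epsilon} - 1$ iterations, i.e. the convergence speed is of the order $O(1/\epsilon)$.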
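The parameter choices of Theorem 2 are simple enough to compute directly. The helper below is a minimal sketch (the function name and argument layout are assumptions for this example); it returns $c$, $L_c$ and $i_{\max}$ rounded up to an integer, given the accuracy $\epsilon$ and per-tone values of $D_{\mathcal{S}_k}$ and $\sigma_{\mathbf{s}_k}$.

```python
import math


def smoothing_parameters(eps, d_bounds, sigmas):
    """Return (c, L_c, i_max) following Theorems 1 and 2.

    eps      : required accuracy
    d_bounds : list of D_{S_k}, one entry per tone
    sigmas   : list of convexity parameters sigma_{s_k}, one entry per tone
    """
    sum_d = sum(d_bounds)
    sum_inv_sigma = sum(1.0 / s for s in sigmas)
    c = eps / sum_d
    l_c = sum_inv_sigma / c
    i_max = math.ceil(2.0 * math.sqrt(sum_inv_sigma * sum_d) / eps - 1.0)
    return c, l_c, i_max
```

Halving `eps` roughly doubles `i_max`, which is the $O(1/\epsilon)$ behaviour stated above; a method of order $O(1/\epsilon^2)$ would need roughly four times as many iterations.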

Note that Algorithm 1 provides a fully automatic approach, i.e. it requires no stepsize tuning, which
is otherwise known to be a very difficult and crucial process. Finally note that combining this algorithm
with an outer loop that iteratively updates the convex approximations leads to an overall procedure that
converges to a local maximizer of the nonconvex problem cWRS [24], [11]. The extension of CA-DSB
with the improved dual decomposition approach will be referred to as Improved CA-DSB (I-CA-DSB).

A final remark on Algorithm 1 is that the independent convex per-tone problems (line 5 of Algorithm 1)
are slightly modified with respect to the standard per-tone problems for CA-DSB. This is a consequence of
the addition of the extra prox-function term. One can use state-of-the-art iterative methods (e.g. Newton's
method) to solve this convex subproblem with guaranteed convergence. An alternative consists in using
an iterative fixed point update approach, which is shown to work well, with very small complexity, and
is easily extended to distributed implementation by using a protocol [14], [11]. The fixed point update
formula for the transmit powers $s_k^n$ used by CA-DSB can be adapted so as to take the extra prox-term
into account. Following the same procedure as explained in [11], consisting of a fixed point reformulation
of the corresponding KKT stationarity condition of (12), we obtain the following transmit power update
formula, which only differs in the presence of the term PROX:

$$s_k^n = \left[ \frac{w_n f_s / \log(2)}{\lambda_n + \underbrace{2 c s_k^n}_{\text{PROX}} + \sum\limits_{m \neq n} w_m f_s a_k^{m,n} - \sum\limits_{m \neq n} \dfrac{w_m f_s}{\log(2)} \dfrac{\Gamma |h_k^{m,n}|^2}{\sum_{p} |\tilde{h}_k^{m,p}|^2 s_k^p + \Gamma \sigma_k^m}} - \frac{\sum_{m \neq n} \Gamma |h_k^{n,m}|^2 s_k^m + \Gamma \sigma_k^n}{|h_k^{n,n}|^2} \right]_0^{s_k^{n,\max}} \tag{28}$$



Providing convergence conditions for this type of iterative fixed point updates is outside the scope
of this paper. In [11], [16], [18], convergence is proven under certain conditions, and demonstrated for
realistic DSL scenarios. This leads to an alternative and fast way of implementing line 5 of Algorithm
1, as specified in Algorithm 2. The number of iterations in line 2 is typically fixed at 3. Note that a
distributed solution is also possible for the full scheme as the dual decomposition approach is decoupled
over the users (see [11] for more details).

Algorithm 2 Iterative fixed point update approach for solving line 5 of Algorithm 1
1: for k = 1 . . . K do
2:   for iterations do
3:     for n = 1 . . . N do
4:       $s_k^n$ := (28)
5:     end for
6:   end for
7: end for
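
The loop structure of Algorithm 2 can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `fixed_point_sweep` and `toy_update` are hypothetical, and `toy_update` is a simple contraction used as a stand-in for the update formula (28), not the formula itself.

```python
from typing import Callable, List

# Hypothetical signature: update(k, n, s) returns the new power of user n on tone k.
PowerUpdate = Callable[[int, int, List[List[float]]], float]


def fixed_point_sweep(s: List[List[float]], update: PowerUpdate,
                      iterations: int = 3) -> List[List[float]]:
    """Run the three nested loops of Algorithm 2 in place.

    s[k][n] is the transmit power of user n on tone k.
    """
    for k in range(len(s)):                # line 1: tones are handled independently
        for _ in range(iterations):        # line 2: fixed number of sweeps
            for n in range(len(s[k])):     # line 3: users are updated one by one
                s[k][n] = update(k, n, s)  # line 4: stand-in for formula (28)
    return s


def toy_update(k: int, n: int, s: List[List[float]]) -> float:
    """Toy linear response (NOT formula (28)): back off as the other
    users' power on the same tone grows, clipped to [0, 1]."""
    interference = sum(p for m, p in enumerate(s[k]) if m != n)
    return min(1.0, max(0.0, 0.5 - 0.25 * interference))


powers = fixed_point_sweep([[0.0, 0.0], [1.0, 1.0]], toy_update)
```

In this toy setting the fixed point is 0.4 for both users on every tone, and three sweeps already bring both starting points close to it, which is in the spirit of fixing the number of iterations at a small value.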



As mentioned, although the improved dual decomposition approach has been elaborated for CA-DSB,
it can similarly be applied to other DSM algorithms based on iterative convex approximations, such as
SCALE, with a similar speed up of convergence. In this case the prox-function can be taken
as $d_k(s_k) = \|s_k\|^2$, resulting in concrete values for $c$, $i_{\max}$ and $L_c$. The extension of SCALE with the
improved dual decomposition approach will be referred to as Improved SCALE (I-SCALE).

B. An improved dual decomposition approach for direct DSM algorithms 

In this section we extend the improved dual decomposition approach to direct DSM algorithms such
as OSB, ISB, ASB, (MS-)DSB, MIW, etc., corresponding to the structure visualized in Figure 1. Using
a similar trick as in Section IV-A, we define a smoothed dual function $\bar{g}(\lambda)$ as follows:

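\[
\bar{g}(\lambda) = \left\{ \begin{array}{ll}
\max\limits_{\{s_k \in \mathcal{S}_k,\, k \in \mathcal{K}\}} & \sum\limits_{k \in \mathcal{K}} f_k(s_k) - \sum\limits_{k \in \mathcal{K}} \sum\limits_{n \in \mathcal{N}} \lambda_n s_k^n + \sum\limits_{n \in \mathcal{N}} \lambda_n P^{n,\mathrm{tot}} - \sum\limits_{k \in \mathcal{K}} c\, d_k(s_k) \\
\text{s.t.} & 0 \le s_k^n \le s_k^{n,\mathrm{mask}}, \quad k \in \mathcal{K},\ n \in \mathcal{N},
\end{array} \right. \tag{29}
\]
where $d_k(s_k)$ is a prox-function, which for instance can be chosen as $d_k(s_k) = \|s_k\|^2$, with $c$ and $L_c$
defined as in Section IV-A in terms of the required accuracy $\epsilon$.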

Note that by choosing a sufficiently small value for $c$, the smoothed dual function $\bar{g}(\lambda)$ can be made
arbitrarily close to the original dual function $g(\lambda)$, i.e. $\bar{g}(\lambda) \approx g(\lambda)$.

This results in the improved dual decomposition approach for direct DSM algorithms, given in
Algorithm 3, where line 4 uses the following optimization problem:

\[
\begin{array}{rl}
\bar{s}_k(\lambda) = \arg\max\limits_{s_k} & f_k(s_k) - \sum\limits_{n \in \mathcal{N}} \lambda_n s_k^n - c\, d_k(s_k) \\
\text{s.t.} & 0 \le s_k^n \le s_k^{n,\mathrm{mask}}, \quad n \in \mathcal{N}.
\end{array} \tag{30}
\]
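
The statement that the smoothed dual function approaches the original one as $c$ decreases can be quantified with a short bound. This is a sketch under the assumption that the prox-function is nonnegative on the feasible set; $\bar{g}$ denotes the smoothed dual function, $g$ the original one, and the constant $D_k$ is notation introduced here for illustration only.

\[
g(\lambda) - c \sum_{k \in \mathcal{K}} D_k \;\le\; \bar{g}(\lambda) \;\le\; g(\lambda),
\qquad D_k = \max_{0 \le s_k^n \le s_k^{n,\mathrm{mask}}} d_k(s_k).
\]

The upper bound holds because a nonnegative term is subtracted inside the maximization; the lower bound follows by evaluating the smoothed objective at a maximizer of the original Lagrangian. For $d_k(s_k) = \|s_k\|^2$ one gets $D_k = \sum_{n} (s_k^{n,\mathrm{mask}})^2$, so the gap shrinks linearly in $c$.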



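Algorithm 3 Improved dual decomposition approach for direct DSM algorithms
1: $i := 0$, tmp $:= 0$
2: initialize $\lambda^i$ and $\epsilon_a$ (desired accuracy)
3: while $\exists n : \left| \lambda_n^i \left( \sum_{k \in \mathcal{K}} s_k^{n,i} - P^{n,\mathrm{tot}} \right) \right| > \epsilon_a$ do
4:   $\forall k : s_k^{i+1} = \bar{s}_k(\lambda^i)$ obtained by solving (30)
5:   $dg^{i+1} = \sum_{k \in \mathcal{K}} s_k^{i+1} - P^{\mathrm{tot}}$
6:   $u^{i+1} = \left[ dg^{i+1}/L_c + \lambda^i \right]^+$
7:   tmp $:=$ tmp $+ \frac{i+1}{2}\, dg^{i+1}$
8:   $v^{i+1} = \left[ \mathrm{tmp}/L_c \right]^+$
9:   $\lambda^{i+1} = \frac{i+1}{i+3}\, u^{i+1} + \frac{2}{i+3}\, v^{i+1}$
10:  $i := i + 1$
11: end while
12: Build $\hat{\lambda} = \lambda^i$ and $\hat{s}_k = s_k^i$, $\forall k \in \mathcal{K}$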
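
The multiplier update of Algorithm 3 can be illustrated with a small Python sketch. It assumes the update takes the usual optimal-gradient form (a projected gradient step `u`, an aggregated step `v`, and a convex combination of the two). The names `improved_dual_decomposition`, `solve_per_tone` and `toy_solver` are hypothetical, the per-tone problem is a toy concave quadratic rather than a DSL model, and the extra feasibility test in the stopping rule is an addition of this sketch.

```python
from typing import Callable, List, Tuple

# Hypothetical interface: maps multipliers lam[n] to per-tone powers s[k][n]
# that solve the smoothed per-tone problems (30).
PerToneSolver = Callable[[List[float]], List[List[float]]]


def improved_dual_decomposition(
    solve_per_tone: PerToneSolver,
    p_tot: List[float],
    l_c: float,
    eps_a: float = 1e-6,
    max_iter: int = 10000,
) -> Tuple[List[float], List[List[float]]]:
    users = range(len(p_tot))
    lam = [0.0 for _ in users]   # multipliers, one per user
    tmp = [0.0 for _ in users]   # weighted running sum of residuals
    for i in range(max_iter):
        s = solve_per_tone(lam)  # independent per-tone problems
        # total power per user minus its budget
        dg = [sum(tone[n] for tone in s) - p_tot[n] for n in users]
        # Stop on complementarity; the feasibility check dg <= eps_a is an
        # addition of this sketch so that lam = 0 does not stop it early.
        if all(abs(lam[n] * dg[n]) <= eps_a and dg[n] <= eps_a for n in users):
            return lam, s
        u = [max(0.0, lam[n] + dg[n] / l_c) for n in users]     # gradient step
        tmp = [tmp[n] + 0.5 * (i + 1) * dg[n] for n in users]   # accumulate
        v = [max(0.0, tmp[n] / l_c) for n in users]             # aggregated step
        lam = [(i + 1) / (i + 3) * u[n] + 2 / (i + 3) * v[n] for n in users]
    return lam, solve_per_tone(lam)


def toy_solver(lam: List[float], c: float = 0.05) -> List[List[float]]:
    """Toy concave per-tone problem (NOT a DSL model):
    maximize -0.5*(s - 1)^2 - lam*s - c*s^2 over 0 <= s <= 10, on 2 tones."""
    return [[min(10.0, max(0.0, (1.0 - l) / (1.0 + 2.0 * c))) for l in lam]
            for _tone in range(2)]


# Curvature of the toy dual: number of tones / (1 + 2c).
lam_opt, s_opt = improved_dual_decomposition(toy_solver, [1.0], 2.0 / 1.1)
```

For this toy problem the optimal multiplier is 0.45 and the total power meets the budget of 1. No stepsize has to be chosen: the only problem-dependent constant is `l_c`.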



Algorithm 3 uses a similar optimal gradient based scheme on the smoothed Lagrangian as in Algorithm 1.
Again, no stepsize tuning is needed. Besides the improved updating procedure for the Lagrange
multipliers (lines 5-9), it involves a slightly different decomposed per-tone problem (30) (line 4). This
can be solved by using a discrete exhaustive search similar to OSB, a discrete coordinate descent
method similar to ISB, or a KKT system approach similar to DSB/MIW/MS-DSB using (28), with the
coefficients $a_k^{n,m}$ defined accordingly. One can also use a virtual reference length approach similar to ASB,
ASB2. Note that for ASB, and when using $d_k(s_k) = \|s_k\|^2$, this increases the complexity, as a polynomial
equation of degree 4 is then to be solved instead of a cubic equation. Depending on the choice of the
algorithm for solving the per-tone problem, there will be a trade-off in complexity versus performance
[11]. We will again add the prefix 'I-' to refer to these algorithms using the improved dual decomposition
approach, i.e. I-OSB, I-ISB, I-DSB/MIW, I-MS-DSB, I-ASB.

The main difference of Algorithm 3 is that line 4 now involves $K$ nonconvex optimization problems,
while line 5 of Algorithm 1 involves $K$ (strongly) convex optimization problems. As a consequence, the
smoothed dual function $\bar{g}(\lambda)$ is not necessarily differentiable and its gradient is not necessarily Lipschitz
continuous. More specifically, this is the case when the maximization defining $\bar{g}(\lambda)$ has multiple globally
optimal solutions for a given Lagrange multiplier $\lambda$. This specific condition, however, mainly occurs for a
particular type of DSL scenario, which is analyzed and discussed in Section V. For these scenarios the worst
case convergence of order $O(1/\epsilon)$ cannot be guaranteed as in Theorem 2, but we can still expect an improved
convergence behaviour with respect to the standard subgradient approach. Except for these specific cases, and so
for most practical DSL scenarios, the smoothed dual function $\bar{g}(\lambda)$ will be differentiable with a Lipschitz
continuous gradient, and so a worst case convergence speed of $O(1/\epsilon)$ is guaranteed. For instance, in [28] conditions
on the channel and noise parameters were given under which cWRS can be "convexified". Under these
conditions, differentiability and Lipschitz continuity hold for $\bar{g}(\lambda)$, and so application of Algorithm 3
will provide a worst case convergence of $O(1/\epsilon)$.

V. An interleaving procedure for recovering the primal solution from the dual solution

The subgradient based dual decomposition approach for solving problem cWRS (4), as well as the
improved dual decomposition approach presented in Sections IV-A and IV-B, converge to the optimal
dual variables. However, because of the nonconvex nature of cWRS, extra care must be taken when
recovering the optimal primal solution, i.e. the optimal transmit powers $s_k^*$, $k \in \mathcal{K}$, for (4), from the optimal
dual variables $\lambda^*$, as was also mentioned in [4], [29]. The fact that the objective function of cWRS is
not strictly concave can result in cases where the optimal $s_k(\lambda^*)$, $k \in \mathcal{K}$, that solves (7) is not unique,
leading to multiple solutions $s_k(\lambda^*)$, $k \in \mathcal{K}$, for given optimal dual variables $\lambda^*$. Formally this can be
expressed as follows:

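\[
\begin{array}{l}
\{s_k(\lambda^*),\, k \in \mathcal{K}\} \in \mathcal{B} = \{(s_{k,1},\, k \in \mathcal{K}), \ldots, (s_{k,|\mathcal{B}|},\, k \in \mathcal{K})\} \\[4pt]
\text{with } s_{k,m} \in \mathcal{S}_k,\ k \in \mathcal{K}, \text{ and } \mathcal{L}(s_{k,m}, k \in \mathcal{K}, \lambda^*) = \max\limits_{\{s_k \in \mathcal{S}_k,\, k \in \mathcal{K}\}} \mathcal{L}(s_k, k \in \mathcal{K}, \lambda^*),\ m \in \{1, \ldots, |\mathcal{B}|\},
\end{array} \tag{31}
\]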

where the cardinality of set $\mathcal{B}$ is larger than 1, i.e. $|\mathcal{B}| > 1$. It is important to note that the elements of
$\mathcal{B}$ are not necessarily solutions to (4), i.e. they do not necessarily satisfy the user total power constraints
of (4). However, there exists at least one element in set $\mathcal{B}$ that does satisfy the total power constraints of (4).
In order to obtain convergence to a primal optimal solution for (4) in the case that $|\mathcal{B}| > 1$, the dual
decomposition approach has to be extended with an extra procedure that chooses an element out of set
$\mathcal{B}$ that satisfies the user total power constraints.

A simple example may clarify this issue. Suppose we have a DSL scenario consisting of
two users ($N = 2$) and two tones ($K = 2$), where the channel matrices (direct and crosstalk components)
and noise components for the two tones are the same, i.e. $H_1 = H_2$ and $\sigma_1^n = \sigma_2^n$, $n \in \mathcal{N}$, and the
weights are also the same, $w_1 = w_2$. Furthermore, suppose the crosstalk components are very large. In this
case, there will be only one user active on each tone [30]. Finally, suppose that the optimal dual variables
$\lambda_1^*, \lambda_2^*$, where $\lambda_1^* = \lambda_2^*$, are given and the total power constraints are $P^n \le \mathrm{ON}$, where ON is a fixed
power level. For this setup there will be 4 possible solutions to (7), namely
$\{s_1^1 = \mathrm{ON}, s_2^1 = \mathrm{ON}, s_1^2 = 0, s_2^2 = 0\}$, $\{s_1^1 = 0, s_2^1 = 0, s_1^2 = \mathrm{ON}, s_2^2 = \mathrm{ON}\}$,
$\{s_1^1 = \mathrm{ON}, s_2^1 = 0, s_1^2 = 0, s_2^2 = \mathrm{ON}\}$ and $\{s_1^1 = 0, s_2^1 = \mathrm{ON}, s_1^2 = \mathrm{ON}, s_2^2 = 0\}$.
Note that all these solutions correspond to exactly the same objective value, but
only the last two solutions are primal optimal solutions, as only they satisfy the user total power constraints.
Typical DSM algorithm implementations, however, have a fixed exhaustive search order or iteration order
over tones, so that one of the first two solutions may be selected and, as a consequence, these algorithms
will not provide the primal optimal solutions of (4). To obtain convergence to the optimal primal variables
of (4), an extra procedure should be added to the dual decomposition approach.
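
The two-user, two-tone example can be checked by brute-force enumeration. The following is a minimal sketch under the assumptions of the example (exactly one active user per tone, each active user transmitting at the level ON); the numeric value chosen for ON and all variable names are illustrative.

```python
from itertools import product

ON = 1.0       # fixed power level (arbitrary unit, illustrative value)
P_TOT = ON     # per-user total power budget, P^n <= ON

# With very strong crosstalk exactly one user is active per tone.
# active[k] is the index of the user that is active on tone k.
candidates = []
for active in product(range(2), repeat=2):
    # s[k][n]: power of user n on tone k
    s = [[ON if n == active[k] else 0.0 for n in range(2)] for k in range(2)]
    totals = [sum(s[k][n] for k in range(2)) for n in range(2)]
    feasible = all(p <= P_TOT for p in totals)
    candidates.append((active, totals, feasible))

feasible_assignments = [a for a, _, ok in candidates if ok]
```

Of the four candidates, only the two in which each user occupies a different tone respect the per-user budget; the two in which one user takes both tones use a total power of 2 ON.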

Note that the above problem is practically only relevant when the phenomenon of non-unique globally
optimal solutions $s_k(\lambda^*)$ occurs at many tones. This is the case for DSL scenarios that have a subset of
strong symmetric crosstalkers with equal line lengths, i.e. lines that generate the same interference to their
environment over multiple tones $k$, with equal weights $w_n$ and user total power constraints $P^{n,\mathrm{tot}}$. Here,
we can have many subsequent tones with multiple globally optimal solutions, namely where only one
of the subset of strong crosstalkers is active [30]. If no special care is taken when recovering the primal
transmit powers, this can lead to extremely slow convergence or even no convergence at all for these
scenarios. More specifically, a fixed exhaustive search order or iteration order in typical DSM algorithm
implementations will choose the same strong crosstalker over all competing tones, instead of equally
dividing the resources over the competing users.

To overcome this problem we propose a very simple, but effective, interleaving procedure. More
specifically, this solution consists of giving priority, alternatingly on a per-tone basis, to the globally
optimal solution that corresponds to a different active strong crosstalker of the symmetric subset. This
interleaving procedure replaces line 4 of Algorithm 3 with the following:

\[
\forall k : \left\{ \begin{array}{rcl}
\mathcal{C}_k & = & \{\text{all globally optimal solutions } \bar{s}_k(\lambda) \text{ of (30) for given } \lambda\} \\
 & = & \{\mathcal{C}_k(1), \ldots, \mathcal{C}_k(|\mathcal{C}_k|)\}, \\
\text{index} & = & \mathrm{rem}(k, |\mathcal{C}_k|) + 1, \\
s_k^{i+1} & = & \mathcal{C}_k(\text{index}),
\end{array} \right. \tag{32}
\]



where 'rem($k$, $|\mathcal{C}_k|$)' refers to the remainder after dividing $k$ by $|\mathcal{C}_k|$. As the suggested solution requires
that all globally optimal solutions in the first step of (32) actually be computed, it should be combined
with algorithms for the per-tone nonconvex problem that indeed compute all these solutions, such as OSB
with a fixed order exhaustive search for all tones, or a multiple starting point approach such as MS-DSB
with a fixed iteration order for all tones.
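
The index rule of the interleaving procedure is easy to state in code. The sketch below uses a hypothetical helper `interleave_select`; it assumes 1-based tone indices and that the list of globally optimal solutions is produced in the same order on every tone, which is what the fixed search or iteration order is meant to provide.

```python
def interleave_select(k: int, optimal_solutions: list):
    """Pick one of the globally optimal per-tone solutions for tone k.

    k is the 1-based tone index; optimal_solutions plays the role of C_k.
    """
    index = k % len(optimal_solutions) + 1   # rem(k, |C_k|) + 1, 1-based
    return optimal_solutions[index - 1]      # C_k(index)


# Two symmetric crosstalkers: either user 1 or user 2 is the active one.
C = ["user 1 active", "user 2 active"]
chosen = [interleave_select(k, C) for k in range(1, 5)]
```

On consecutive tones the selection alternates between the two candidates, so that the competing users share the contested tones instead of one user being selected everywhere.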

In the simulation Section VI it will be demonstrated how the usage of (32) significantly improves the
robustness of the dual decomposition approach for cWRS.

Remark: The above-mentioned non-uniqueness also has an impact on the Lipschitz continuity condition
of the smoothed gradient. More specifically this condition reduces to [19]:



For the above two-user two-tone symmetric strong crosstalk example, this condition does not hold. This
can be shown as follows. Let us compare two cases: (1) optimal dual variables $(\lambda_1^*, \lambda_2^* + \mu)$ corresponding
to primal variables $\{s_1^1 = \mathrm{ON}, s_2^1 = \mathrm{ON}, s_1^2 = 0, s_2^2 = 0\}$, and (2) optimal dual variables $(\lambda_1^* + \mu, \lambda_2^*)$
corresponding to primal variables $\{s_1^1 = 0, s_2^1 = 0, s_1^2 = \mathrm{ON}, s_2^2 = \mathrm{ON}\}$, where $\mu > 0$. For very small $\mu$
these two cases have only slightly different dual variables but completely different primal variables. So
a small change in Lagrange multipliers can lead to a large change in primal variables. This means that
for these specific cases Lipschitz continuity (33) is not satisfied, and so the convergence speed will be
worse than $O(1/\epsilon)$. However, adding the interleaving trick alleviates this problem, as will be demonstrated
in Section VI.
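
The argument of this remark can be written out in formulas. The following is a sketch, assuming that the gradient of the smoothed dual function equals $P^{\mathrm{tot}} - \sum_k \bar{s}_k(\lambda)$, so that Lipschitz continuity with constant $L_c$ bounds the change in aggregated transmit powers by the change in multipliers. The symbols $\lambda$ and $\lambda'$ for the multiplier vectors of cases (1) and (2) are introduced here for illustration, and the primal variables are taken as stated in the remark.

\[
\Big\| \sum_{k \in \mathcal{K}} \bar{s}_k(\lambda) - \sum_{k \in \mathcal{K}} \bar{s}_k(\lambda') \Big\| \;\le\; L_c\, \|\lambda - \lambda'\| .
\]

\[
\lambda = (\lambda_1^*, \lambda_2^* + \mu),\quad \lambda' = (\lambda_1^* + \mu, \lambda_2^*):
\qquad \|\lambda - \lambda'\| = \sqrt{2}\,\mu,
\qquad \big\| (2\,\mathrm{ON}, 0) - (0, 2\,\mathrm{ON}) \big\| = 2\sqrt{2}\,\mathrm{ON}.
\]

The ratio of the two norms is $2\,\mathrm{ON}/\mu$, which grows without bound as $\mu \to 0$, so no finite constant $L_c$ can satisfy the inequality for this example.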

VI. Simulation results 

In this section, simulation results are shown that compare the performance of the improved dual
decomposition approach with that of the subgradient based dual decomposition approach. More
specifically, in Section VI-A we demonstrate the convergence speed-up obtained by using the improved dual
decomposition approach instead of the subgradient based dual decomposition approach for a DSM
algorithm based on iterative convex approximations (CA-DSB). In Section VI-B we demonstrate how
the improved dual decomposition approach in combination with a direct DSM algorithm (MS-DSB)
succeeds in providing much faster convergence than the subgradient based dual decomposition
approach. Furthermore, the convergence improvement for the interleaving procedure presented in Section
V is demonstrated.

The following parameter settings are used for the simulated DSL scenarios. The twisted pair lines
have a diameter of 0.5 mm (24 AWG). The maximum per-user total transmit power is 11.5 dBm for the
VDSL scenarios and 20.4 dBm for the ADSL scenarios. The SNR gap $\Gamma$ is 12.9 dB, corresponding to
a coding gain of 3 dB, a noise margin of 6 dB, and a target symbol error probability of $10^{-7}$. The tone
spacing $\Delta_f$ is 4.3125 kHz. The DMT symbol rate $f_s$ is 4 kHz.
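
For readers who want to plug these settings into their own code, the decibel quantities convert to linear scale as follows. This is a small convenience sketch; the helper names are hypothetical.

```python
def dbm_to_mw(p_dbm: float) -> float:
    """Convert a power level in dBm to milliwatts."""
    return 10.0 ** (p_dbm / 10.0)


def db_to_linear(x_db: float) -> float:
    """Convert a ratio in dB to linear scale."""
    return 10.0 ** (x_db / 10.0)


vdsl_budget_mw = dbm_to_mw(11.5)      # per-user VDSL budget, about 14.1 mW
adsl_budget_mw = dbm_to_mw(20.4)      # per-user ADSL budget, about 109.6 mW
snr_gap_linear = db_to_linear(12.9)   # SNR gap, about 19.5 in linear scale
```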



A. Convergence speed up for iterative convex approximation based DSM 

A first DSL scenario is shown in Figure 3. This is a so-called near-far scenario, which is known to be
challenging and where DSM can make a substantial difference. For this scenario, we compare the convergence
behaviour of the improved approach for CA-DSB (Algorithm 1) and the standard subgradient based
dual decomposition approach for CA-DSB, where convergence is defined as achieving the optimal dual
value of the convex approximation within an accuracy of 0.05%. The results are shown in Figure 4. For the
subgradient scheme we used the stepsize update rule $\delta = q/i$, where $q$ is the initial stepsize and $i$ is
the iteration counter [4]. This update rule is proven to converge to the optimal dual value. It can be
observed that different initial stepsizes lead to different convergence behaviour, and that the initial
stepsize is generally difficult to tune. Note that for all initial stepsizes, the subgradient dual decomposition
approach is still far from convergence after 500 iterations. The improved dual decomposition approach, on the
other hand, automatically tunes its stepsize and converges very rapidly in only 40 iterations.
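
The sensitivity of the rule $\delta = q/i$ to the initial stepsize $q$ can already be seen on a very small toy problem. The sketch below is not a reproduction of the simulations in this section: the one-user, two-tone model and the function name `subgradient_run` are assumptions made only for illustration.

```python
def subgradient_run(q: float, iterations: int = 500) -> float:
    """Projected subgradient update lam <- [lam + (q/i) * dg]^+ on a toy dual.

    Toy model (an assumption, not a DSL channel): one user, two tones,
    per-tone power s = clip(1 - lam, 0, 1), budget P_tot = 1, so the
    optimal multiplier is 0.5.
    """
    lam = 0.0
    for i in range(1, iterations + 1):
        s = min(1.0, max(0.0, 1.0 - lam))   # same power on both tones
        dg = 2.0 * s - 1.0                  # total power minus budget
        lam = max(0.0, lam + (q / i) * dg)  # diminishing stepsize q / i
    return lam


errors = {q: abs(subgradient_run(q) - 0.5) for q in (0.1, 0.5)}
```

With `q = 0.5` the toy problem is solved in the first step, whereas with `q = 0.1` the multiplier is still far from 0.5 after 500 iterations. A good value of `q` depends on the curvature of the dual function, which is not known in advance for a real DSL scenario.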

[Figure 3 diagram: a line from the CO (Modem 1 to Modem 1, 5000 m) and a line from RT1 (Modem 2 to Modem 2, 3000 m); a further distance label of 3000 m appears in the drawing.]

Fig. 3. 2-user near-far ADSL downstream scenario

Fig. 4. Comparison of convergence behaviour between the subgradient dual decomposition approach, with different initial stepsizes
q, and the improved dual decomposition approach, for CA-DSB



B. Convergence speed up for direct DSM 

It was shown in [5] that for direct DSM algorithms the subgradient based dual decomposition approach
with a particular stepsize selection procedure works well for ADSL scenarios, i.e. there are typically only
50-100 subgradient iterations needed to converge to the optimal dual variables. However, for multi-user
VDSL scenarios, which use a much larger frequency range and have to cope with significantly more
crosstalk interference, existing subgradient approaches [4], [5] are found to have significant convergence
problems. We will focus on such VDSL scenarios and demonstrate how the improved approach succeeds
in providing much faster convergence.

The different VDSL scenarios are shown in Figures 5, 6 and 8, i.e. a four-user VDSL upstream scenario, a six-user
VDSL upstream scenario, and a six-user VDSL upstream scenario with a subset of strong symmetric crosstalkers,
respectively. The weights $w_n$ are chosen equal for all users $n$, namely $w_n = 1/N$. Note that we used the
multiple starting point procedure MS-DSB to solve the nonconvex per-tone problems for the subgradient
based dual decomposition approach, as well as for the improved dual decomposition approach using (28).
In [11] it was shown that this procedure provides globally optimal performance for practical ADSL and
VDSL scenarios.

The first scenario, shown in Figure 5, is a four-user upstream VDSL scenario, consisting of two
far-users with line length 1200 m and two near-users with line length 300 m. In the higher frequency range,
there is a significant crosstalk coupling. This is a near-far scenario where spectrum management is crucial
to avoid significant performance degradation for the far-end users. Note that the near-end users form
a subset of strong symmetric crosstalkers in the high frequency range. As mentioned in Section V, this
can cause significant convergence problems for the dual decomposition approach. In fact, simulations
show that the subgradient methods in [4] and [5] fail to converge to the optimal dual variables, i.e. after 20000
iterations the complementarity conditions for some users are far from being satisfied. The main problem
is that the stepsize selection procedure, which is a crucial component for fast convergence, is difficult
to tune. For decreasing stepsizes as proposed in [4], with different initial stepsizes, the procedure does
not converge. For adaptive stepsizes, as proposed in [5], very small stepsizes are selected, resulting in
very slow convergence (> 20000 iterations). It is observed that for some users there is a fast convergence
to the corresponding complementarity conditions, whereas for other users convergence is very slow. The
presence of the subset of strong symmetric crosstalkers can lead to large changes in primal variables
for small changes in dual variables, as discussed in Section V, if stepsizes are not tuned carefully. The
improved approach of Algorithm 3, on the contrary, converges very fast to the optimal dual and primal
variables. Convergence is obtained in only 100 iterations, within an accuracy of 0.05%.

Fig. 5. 4-user VDSL upstream scenario

The second VDSL upstream scenario, shown in Figure 6, consists of six users with different line
lengths. Also for this large crosstalk scenario, the standard subgradient approaches [4], [5] fail to converge
to the optimal dual variables, i.e. after 10000 iterations the complementarity conditions are far from being
satisfied. Similarly to the scenario of Figure 5, one can observe very different convergence behaviour
for the different users to the corresponding complementarity conditions, where typically for a few users
convergence is very slow. The improved dual decomposition approach, however, converges to the optimal
dual and primal variables in only 150 iterations, within an accuracy of 0.05%. The optimal transmit
powers are shown in Figure 7 for illustration.

Fig. 6. 6-user VDSL upstream scenario



Fig. 7. Optimal transmit powers for DSL scenario of Fig. 6 obtained using the improved dual decomposition approach. Blue,
green, red, cyan, magenta and yellow curves correspond to transmit powers of users with line length 1200 m, 1000 m, 800 m,
600 m, 450 m and 300 m, respectively.

The VDSL upstream scenario of Figure 8 consists of a six-line cable bundle with a subset of three
strong symmetric crosstalkers, namely the set of lines with length 300 m. The standard subgradient
approaches [4], [5] fail to converge to the optimal dual variables. The presence of the strong symmetric
crosstalkers significantly slows down the convergence, as it can lead to multiple globally optimal solutions
for particular values of the dual variables. Here, stepsize selection is crucial, as a small change in dual
variables can lead to a large change in primal variables, as also explained in Section V. The improved
dual decomposition approach converges to the optimal dual variables in only 150 iterations, but does not
succeed in obtaining the primal optimal variables, because of the existence of multiple globally optimal
solutions (i.e. optimal transmit powers) for optimal dual variables that do not satisfy the user total power
constraints. More specifically for this scenario, for the obtained optimal dual variables, the obtained
transmit powers jump between different solutions, with total powers $\{P^1, P^2, P^3\} = \{P^{1,\mathrm{tot}}, P^{2,\mathrm{tot}}, P^{3,\mathrm{tot}}\}$
and $\{P^4, P^5, P^6\} \in \{\{3P^{\mathrm{tot}}, \Delta, \Delta\}, \{\Delta, 3P^{\mathrm{tot}}, \Delta\}, \{\Delta, \Delta, 3P^{\mathrm{tot}}\}\}$, with $\Delta$ being very small. These
primal solutions are shown in Figures 9, 10 and 11. One can observe that in the low and medium
frequency range (used tones 1-727), the users with line lengths 1200 m, 900 m and 600 m are active. In
this frequency range the strong crosstalkers will back off and transmit at small, similar transmit powers
corresponding to a total power equal to $\Delta$. However, in the high frequency range (used tones 727-1147),
where the users with line lengths 1200 m, 900 m and 600 m are switched off, the three strong crosstalkers
will compete, and only one user can be active in each tone $k$ because of the significant crosstalk
interference [30]. As explained in Section V, typical DSM algorithm implementations will select the
same active user for each of these tones, namely the user that corresponds to the smallest dual variable,
where the dual variable can be seen as a penalty. So instead of dividing the total power over the three
users equally, which would lead to a primal solution satisfying the per-user total power constraints, one
user gets all power, leading to $P^n = 3P^{n,\mathrm{tot}}$ for user $n$ and $P^m = \Delta$ for users $m \neq n$. Note that this
prevents convergence to the optimal primal variables satisfying the per-user total power constraints.

However, when applying the interleaving procedure (32), as proposed in Section V, together
with the improved dual decomposition approach, we can observe a very fast convergence both in primal
and dual variables. Convergence is achieved in only 150 iterations, within an accuracy of 0.05%. The
obtained optimal transmit powers are shown in Figure 12. In the frequency range between tone 728 and
tone 1147, one can observe the interleaving effect. Figure 13 zooms in on tones 970 up to 975.

Remark: In the practical implementation the first step of the interleaving procedure is changed to
'all best solutions that are 99.9% close to each other'. This prevents the procedure from being active only
when the dual variables are exactly the same. The overall effect of this is a negligible noise on the
transmit powers, as can be seen in Figure 12.
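
One possible reading of the relaxed first step is sketched below. Interpreting '99.9% close' as "objective value at least 0.999 times the best value" (for a positive best value) is an assumption about the implementation, and the helper name `near_optimal` is hypothetical.

```python
def near_optimal(solutions, objective, rel_tol=0.999):
    """Return the solutions whose objective is within rel_tol of the best.

    Assumes a positive best objective value; the 0.999 threshold mirrors
    the '99.9% close' rule under the interpretation stated above.
    """
    values = [objective(s) for s in solutions]
    best = max(values)
    return [s for s, v in zip(solutions, values) if v >= rel_tol * best]


# Illustrative per-tone candidates with made-up objective values.
scores = {"user 4 active": 10.0, "user 5 active": 9.995, "all silent": 7.0}
ties = near_optimal(list(scores), scores.get)
```

The two nearly equivalent candidates are both kept and can then be passed to the interleaving rule, while the clearly worse candidate is discarded.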

Remark: Note that applying the interleaving procedure combined with the improved dual decomposition
approach for the scenarios in Figures 5 and 6 also leads to a faster convergence in both dual and
primal variables.

Fig. 8. 6-user VDSL upstream scenario with subset of strong symmetric crosstalkers

Fig. 9. Transmit powers for DSL scenario of Fig. 8 for optimal dual variables $\lambda^*$ and with user total powers
$\{P^1, P^2, P^3, P^4, P^5, P^6\} = \{P^{1,\mathrm{tot}}, P^{2,\mathrm{tot}}, P^{3,\mathrm{tot}}, 3P^{4,\mathrm{tot}}, \Delta, \Delta\}$, where $\Delta \ll P^{4,\mathrm{tot}}$. Blue, green, red, cyan, magenta,
yellow curves correspond to transmit powers of users 1, 2, 3, 4, 5 and 6 respectively.

[Plot: transmit power versus frequency tones (US1 + US2, VDSL bandplan 998); horizontal axis ticks from 200 to 1200, vertical axis ticks from -160 to -40.]

Fig. 10. Transmit powers for DSL scenario of Fig. 8 for optimal dual variables $\lambda^*$ and with user total powers
$\{P^1, P^2, P^3, P^4, P^5, P^6\} = \{P^{1,\mathrm{tot}}, P^{2,\mathrm{tot}}, P^{3,\mathrm{tot}}, \Delta, 3P^{5,\mathrm{tot}}, \Delta\}$, where $\Delta \ll P^{5,\mathrm{tot}}$. Blue, green, red, cyan, magenta,
yellow curves correspond to transmit powers of users 1, 2, 3, 4, 5 and 6 respectively.



Fig. 11. Transmit powers for DSL scenario of Fig. 8 for optimal dual variables $\lambda^*$ and with user total powers

Fig. 12. Transmit powers for DSL scenario of Fig. 8 obtained using the improved dual decomposition approach with
the interleaving procedure (32). Blue, green, red, cyan, magenta, yellow curves correspond to transmit powers of users 1, 2, 3, 4, 5
and 6 respectively.

VII. Conclusion

Dynamic spectrum management has been recognized as a key technology to significantly improve
the performance of DSL broadband access networks by mitigating the impact of crosstalk interference.
Existing DSM algorithms use a standard subgradient based dual decomposition approach to tackle the
corresponding nonconvex optimization problems. However, this standard approach is often found to lead
to extremely slow convergence or even no convergence at all. Especially for multiuser VDSL scenarios
with subsets of strong symmetric crosstalkers, significant convergence problems are observed because
(1) the stepsize selection procedure of the subgradient updates is very critical, and (2) special
care must be taken when recovering the optimal transmit powers from the optimal dual solution. This
paper proposes an improved dual decomposition approach, which consists of an optimal gradient based
scheme with an automatic optimal stepsize selection, removing the need for a tuning strategy. With this
approach it is shown how the convergence of current state-of-the-art DSM algorithms based on iterative
convex approximations is improved by one order of magnitude. The improved dual decomposition
approach is also applied to other DSM algorithms (OSB, ISB, ASB, (MS)-DSB, MIW). The addition
of an extra interleaving procedure for recovering the optimal transmit powers from the dual optimal
solution furthermore improves the convergence of the proposed approach. Simulation results demonstrate
that significant convergence speed ups are obtained using the proposed improved dual decomposition
approach.